Abstract

We prove a variant of the Johnson-Lindenstrauss lemma for matrices with circulant structure. This approach minimises the randomness used, is easy to implement and provides good running times. The price to be paid is a higher dimension of the target space, $k = O(\varepsilon^{-2} \log^3 n)$, instead of the classical bound $k = O(\varepsilon^{-2} \log n)$.

AMS Classification: 52C99, 68Q01
1 Introduction

The classical Johnson-Lindenstrauss lemma may be formulated as follows. 

Theorem 1.1. Let $\varepsilon \in (0, \tfrac{1}{2})$ and let $x_1, \dots, x_n \in \mathbb{R}^d$ be arbitrary points. Let $k = O(\varepsilon^{-2} \log n)$ be a natural number. Then there exists a (linear) mapping $f : \mathbb{R}^d \to \mathbb{R}^k$ such that

$$(1-\varepsilon)\|x_i - x_j\|_2^2 \le \|f(x_i) - f(x_j)\|_2^2 \le (1+\varepsilon)\|x_i - x_j\|_2^2$$

for all $i, j \in \{1, \dots, n\}$. Here $\|\cdot\|_2$ stands for the Euclidean norm in $\mathbb{R}^d$ or $\mathbb{R}^k$, respectively.
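The two-sided inequality of Theorem 1.1 can be checked numerically for any concrete map. The following minimal sketch (the helper names `squared_distance` and `max_distortion` are hypothetical, not taken from the paper) computes the smallest $\varepsilon$ for which the inequality holds on a given finite point set and its images.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equally long sequences."""
    return sum((s - t) ** 2 for s, t in zip(u, v))


def max_distortion(points, images):
    """Smallest eps such that every pair of distinct points satisfies
    (1 - eps)|xi - xj|^2 <= |f(xi) - f(xj)|^2 <= (1 + eps)|xi - xj|^2."""
    worst = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            original = squared_distance(points[i], points[j])
            if original == 0.0:
                continue  # coinciding points are skipped in this sketch
            ratio = squared_distance(images[i], images[j]) / original
            worst = max(worst, abs(ratio - 1.0))
    return worst
```

A map satisfies the conclusion of the theorem on the given points exactly when the returned value is at most $\varepsilon$.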

The original proof of Johnson and Lindenstrauss [11] uses (up to a scaling factor) an orthogonal projection onto a random $k$-dimensional subspace of $\mathbb{R}^d$. We refer also to [7] for a beautiful and self-contained proof. Later on, this lemma found many applications, especially in the design of algorithms, where it sometimes allows one to reduce the dimension of the underlying problem substantially and to break the so-called "curse of dimension", cf. [9] or [10].

The evaluation of $f(x)$, where $f$ is a projection onto a random $k$-dimensional subspace, is a very time-consuming operation. Therefore, a significant effort was devoted to

• minimizing the running time of the evaluation of $f(x)$,

• minimizing the memory used,

• minimizing the number of random bits used,

• simplifying the algorithm to allow an easy implementation.

Achlioptas observed in [1] that the mapping may also be realised by a matrix, where each component is selected independently at random with a fixed distribution. This decreases the time for the evaluation of $f(x)$ substantially.

An important breakthrough was achieved by Ailon and Chazelle in [3]. Let us briefly describe their Fast Johnson-Lindenstrauss transform (FJLT). The FJLT is the product of three matrices $f(x) = PHDx$, where

• $P$ is a $k \times d$ matrix, where each component is generated independently at random. In particular, $P_{i,j} \sim N(0, q^{-1})$ with probability

$$q = \min\left\{\Theta\!\left(\frac{\log^2 n}{d}\right), 1\right\}$$

and $P_{i,j} = 0$ with probability $1 - q$,

• $H$ is the $d \times d$ normalised Hadamard matrix,

• $D$ is a random $d \times d$ diagonal matrix, with each $D_{i,i}$ drawn independently from $\{-1, 1\}$ with probability $1/2$.

It follows that, with high probability, $f(x)$ may be calculated in time $O(d \log d + q d \varepsilon^{-2} \log n)$.
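The three factors of the FJLT can be sketched in a few lines. This is only an illustration under stated assumptions: $d$ is taken to be a power of two, the sparsity $q$ is passed in by the caller, the non-zero entries of $P$ are given variance $1/q$, and no overall scaling constant is applied. The names `fwht_normalised` and `fjlt_sketch` are hypothetical.

```python
import math
import random


def fwht_normalised(v):
    """Apply the normalised d x d Hadamard matrix H in O(d log d) operations."""
    out = list(v)
    d = len(out)
    if d == 0 or d & (d - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < d:
        for start in range(0, d, 2 * h):
            for i in range(start, start + h):
                s, t = out[i], out[i + h]
                out[i], out[i + h] = s + t, s - t   # butterfly step
        h *= 2
    scale = 1.0 / math.sqrt(d)
    return [scale * t for t in out]


def fjlt_sketch(x, k, q, rng):
    """Evaluate P H D x with freshly drawn P and D (illustration only)."""
    d = len(x)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]      # diagonal of D
    y = fwht_normalised([s * t for s, t in zip(signs, x)])   # H D x
    sigma = 1.0 / math.sqrt(q)                               # assumed std of non-zero entries
    result = []
    for _ in range(k):
        row_sum = 0.0
        for j in range(d):
            if rng.random() < q:                             # entry is non-zero with probability q
                row_sum += rng.gauss(0.0, sigma) * y[j]
        result.append(row_sum)
    return result
```

The Hadamard step costs $O(d \log d)$, while the sparse product touches about $qkd$ entries on average, which is where the second term of the running time comes from.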

We refer to [14] for a historical overview as well as for an extensive description of the present "state of the art".

In this note we propose another direction to approach the Johnson-Lindenstrauss lemma, namely we investigate the possibility of taking a partial circulant matrix for $f$ combined with a random $\pm 1$ diagonal matrix, see the next section for exact definitions.

This transform has a running time of $O(d \log d)$, requires less randomness ($2d$ instead of $kd$ or $(k+1)d$ used in [1, 2, 3]) and allows a simpler implementation.

Unfortunately, up to now, we were only able to prove the statement with $k = O(\varepsilon^{-2} \log^3 n)$, compared to the standard value $k = O(\varepsilon^{-2} \log n)$. We leave the possible improvements of this bound open for further investigations.



2 Circulant matrices 

We study the question (which to our knowledge has not been addressed in the literature before) whether $f$ in the Johnson-Lindenstrauss lemma may be chosen as a circulant matrix. Let us give the necessary notation.

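Let $a = (a_0, \dots, a_{d-1})$ be independent identically distributed random variables. We denote by $M_{a,k}$ the partial circulant matrix

$$M_{a,k} = \begin{pmatrix} a_0 & a_1 & a_2 & \dots & a_{d-1} \\ a_{d-1} & a_0 & a_1 & \dots & a_{d-2} \\ a_{d-2} & a_{d-1} & a_0 & \dots & a_{d-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{d-k+1} & a_{d-k+2} & a_{d-k+3} & \dots & a_{d-k} \end{pmatrix} \in \mathbb{R}^{k \times d}.$$

Furthermore, if $\varkappa = (\varkappa_0, \dots, \varkappa_{d-1})$ are independent Bernoulli variables, we put

$$D_\varkappa = \begin{pmatrix} \varkappa_0 & 0 & \dots & 0 \\ 0 & \varkappa_1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \varkappa_{d-1} \end{pmatrix}.$$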

Theorem 2.1. Let $x_1, \dots, x_n$ be arbitrary points in $\mathbb{R}^d$, let $\varepsilon \in (0, \tfrac{1}{2})$ and let $k = O(\varepsilon^{-2} \log^3 n)$ be a natural number. Let $a = (a_0, \dots, a_{d-1})$ be independent Bernoulli variables or independent normally distributed variables. Let $M_{a,k}$ and $D_\varkappa$ be as above and put $f(x) = \frac{1}{\sqrt{k}} M_{a,k} D_\varkappa x$.

Then with probability at least $2/3$ the following holds:

$$(1-\varepsilon)\|x_i - x_j\|_2^2 \le \|f(x_i) - f(x_j)\|_2^2 \le (1+\varepsilon)\|x_i - x_j\|_2^2, \qquad i, j = 1, \dots, n.$$
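The map of Theorem 2.1 is short to write down. The sketch below assumes that row $j$ of $M_{a,k}$ is the vector $a$ shifted cyclically $j$ places to the right (this matches the shift operator used in the proof of Lemma 2.4); the function names are hypothetical. For readability it uses the direct $O(kd)$ product; the $O(d \log d)$ running time quoted in the introduction would need an FFT-based cyclic convolution, which is not shown here.

```python
import math
import random


def partial_circulant_apply(a, k, y):
    """Return M_{a,k} y, where entry (j, m) of M_{a,k} is a[(m - j) mod d]."""
    d = len(a)
    return [sum(a[(m - j) % d] * y[m] for m in range(d)) for j in range(k)]


def circulant_jl(x, a, kappa, k):
    """f(x) = k^{-1/2} M_{a,k} D_kappa x."""
    y = [s * t for s, t in zip(kappa, x)]        # D_kappa x: random sign flips
    return [v / math.sqrt(k) for v in partial_circulant_apply(a, k, y)]


def draw_parameters(d, rng, gaussian=True):
    """Draw the 2d random numbers used by the transform: a and kappa."""
    if gaussian:
        a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    else:
        a = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    kappa = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    return a, kappa
```

A call such as `circulant_jl(x, *draw_parameters(len(x), random.Random(1)), k)` then maps a vector from $\mathbb{R}^d$ to $\mathbb{R}^k$; the same `a` and `kappa` must be reused for all points $x_1, \dots, x_n$.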

The preconditioning of $x$ using $D_\varkappa$ seems to be necessary and we shall comment on this point later on. Its role may be compared with the use of the random Fourier transform in [3].

In contrast to the above mentioned variants of the Johnson-Lindenstrauss lemma, the coordinates of $f(x)$ are now no longer independent random variables. Our approach "decouples" the dependence caused by the circulant structure. It resembles in some aspects the methods used recently in compressed sensing, cf. [4, 5, 15].

First, we recall Lemma 1 from Section 4.1 of [13] (cf. also Lemma 2.2 of [14]), which shall be useful later on.

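Lemma 2.2. Let

$$Z = \sum_{i=1}^{D} \alpha_i (a_i^2 - 1),$$

where $a_i$ are i.i.d. normal variables and $\alpha_i$ are nonnegative real numbers. Then for any $t > 0$

$$\mathbb{P}\bigl(Z \ge 2\|\alpha\|_2 \sqrt{t} + 2\|\alpha\|_\infty t\bigr) \le \exp(-t),$$
$$\mathbb{P}\bigl(Z \le -2\|\alpha\|_2 \sqrt{t}\bigr) \le \exp(-t).$$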

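Furthermore, we shall use the decoupling lemma of [6, Proposition 1.9].

Lemma 2.3. Let $\xi_0, \dots, \xi_{d-1}$ be independent random variables with $\mathbb{E}\xi_0 = \dots = \mathbb{E}\xi_{d-1} = 0$ and let $\{x_{i,j}\}_{i,j=0}^{d-1}$ be a double sequence of real numbers. Then for $1 \le p < \infty$

$$\mathbb{E}\Bigl|\sum_{i \ne j} x_{i,j}\, \xi_i \xi_j\Bigr|^p \le 4^p\, \mathbb{E}\Bigl|\sum_{i \ne j} x_{i,j}\, \xi_i \xi'_j\Bigr|^p,$$

where $(\xi'_0, \dots, \xi'_{d-1})$ denotes an independent copy of $(\xi_0, \dots, \xi_{d-1})$.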

The key role in the proof of the Johnson-Lindenstrauss lemma is played by the following estimates. 

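Lemma 2.4. Let $k \le d$ be natural numbers and let $\varepsilon \in (0, \tfrac{1}{2})$. Let $a = (a_0, \dots, a_{d-1})$, $M_{a,k}$ and $D_\varkappa$ be as in Theorem 2.1 and let $x \in \mathbb{R}^d$ be a unit vector. Put $f(x) = M_{a,k} D_\varkappa x$.

Then there is a constant $c$, independent of $k$, $d$, $\varepsilon$ and $x$, such that

$$\mathbb{P}_{a,\varkappa}\bigl(\|f(x)\|_2^2 \ge (1+\varepsilon)k\bigr) \le \exp\bigl(-c(k\varepsilon^2)^{1/3}\bigr)$$

and

$$\mathbb{P}_{a,\varkappa}\bigl(\|f(x)\|_2^2 \le (1-\varepsilon)k\bigr) \le \exp\bigl(-c(k\varepsilon^2)^{1/3}\bigr).$$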
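Proof. Let $S : \mathbb{R}^d \to \mathbb{R}^d$ denote the shift operator

$$S(x_0, x_1, \dots, x_{d-1}) = (x_{d-1}, x_0, x_1, \dots, x_{d-2}), \qquad x \in \mathbb{R}^d.$$

Then

$$\|f(x)\|_2^2 = \|M_{a,k} D_\varkappa x\|_2^2 = \sum_{j=0}^{k-1} \bigl|\langle S^j a, D_\varkappa x\rangle\bigr|^2 = \sum_{j=0}^{k-1} \Bigl(\sum_{i=0}^{d-1} a_i \varkappa_{j+i} x_{j+i}\Bigr)^2 = I + II,$$

where

$$I = \sum_{i=0}^{d-1} a_i^2 \sum_{j=0}^{k-1} x_{j+i}^2$$

and

$$II = \sum_{j=0}^{k-1} \sum_{i \ne i'} a_i a_{i'} \varkappa_{j+i} \varkappa_{j+i'} x_{j+i} x_{j+i'}.$$

Here (and any time later) the summation in the index is to be understood modulo $d$.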
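The splitting of $\|M_{a,k} D_\varkappa x\|_2^2$ into a diagonal part $I$ and an off-diagonal part $II$ is a purely algebraic identity (it only uses $\varkappa_i^2 = 1$), so it can be checked on any concrete input. A minimal sketch, with hypothetical function names and all indices taken modulo $d$:

```python
def squared_norm(a, kappa, x, k):
    """Sum over j of (sum_i a_i kappa_{j+i} x_{j+i})^2, indices modulo d."""
    d = len(a)
    return sum(
        sum(a[i] * kappa[(j + i) % d] * x[(j + i) % d] for i in range(d)) ** 2
        for j in range(k)
    )


def diagonal_part(a, x, k):
    """I = sum_i a_i^2 * sum_j x_{j+i}^2 (the signs kappa square to one)."""
    d = len(a)
    return sum(a[i] ** 2 * sum(x[(j + i) % d] ** 2 for j in range(k)) for i in range(d))


def off_diagonal_part(a, kappa, x, k):
    """II = sum_j sum_{i != i'} a_i a_i' kappa_{j+i} kappa_{j+i'} x_{j+i} x_{j+i'}."""
    d = len(a)
    total = 0.0
    for j in range(k):
        for i in range(d):
            for i2 in range(d):
                if i != i2:
                    total += (a[i] * a[i2]
                              * kappa[(j + i) % d] * kappa[(j + i2) % d]
                              * x[(j + i) % d] * x[(j + i2) % d])
    return total
```

For a unit vector $x$ and Bernoulli $a$ the diagonal part equals $k$ exactly, which is why only the off-diagonal part needs a probabilistic estimate in that case.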
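The decoupling of the circulant matrix is based on

$$\mathbb{P}_{a,\varkappa}\bigl(\|M_{a,k} D_\varkappa x\|_2^2 \ge (1+\varepsilon)k\bigr) \le \mathbb{P}_a\bigl(I \ge (1+\varepsilon/2)k\bigr) + \mathbb{P}_{a,\varkappa}\bigl(II \ge \varepsilon k/2\bigr) \tag{2.1}$$

and

$$\mathbb{P}_{a,\varkappa}\bigl(\|M_{a,k} D_\varkappa x\|_2^2 \le (1-\varepsilon)k\bigr) \le \mathbb{P}_a\bigl(I \le (1-\varepsilon/2)k\bigr) + \mathbb{P}_{a,\varkappa}\bigl(II \le -\varepsilon k/2\bigr). \tag{2.2}$$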
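We use Lemma 2.2 to estimate the diagonal term $I$. (If the $a_i$ are Bernoulli variables, then $a_i^2 = 1$ and $I = k$, so there is nothing to estimate.) We choose $\alpha_i = \sum_{j=0}^{k-1} x_{j+i}^2$ and get $\|\alpha\|_1 = k$, $\|\alpha\|_\infty \le 1$ and hence $\|\alpha\|_2 \le \sqrt{k}$. This leads to

$$\mathbb{P}_a\bigl(I \le k - 2\sqrt{kt}\bigr) \le \exp(-t) \tag{2.3}$$

and

$$\mathbb{P}_a\bigl(I \ge k + 2\sqrt{kt} + 2t\bigr) \le \exp(-t). \tag{2.4}$$

We set $\varepsilon k/2 = 2\sqrt{kt}$, i.e. $t = \varepsilon^2 k/16$, in (2.3) and obtain

$$\mathbb{P}_a\bigl(I \le (1-\varepsilon/2)k\bigr) \le \exp(-\varepsilon^2 k/16). \tag{2.5}$$

On the other hand, if $c = 5/2 - \sqrt{6} > 1/20$, then $\sqrt{c} + c/2 = 1/4$ and

$$2\sqrt{kt} + 2t \le \varepsilon k/2$$

for $t = c\varepsilon^2 k$, which finally gives

$$\mathbb{P}_a\bigl(I \ge (1+\varepsilon/2)k\bigr) \le \exp(-c\varepsilon^2 k). \tag{2.6}$$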
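The arithmetic behind the constant $c = 5/2 - \sqrt{6}$ is easy to verify. With $t = c\varepsilon^2 k$ one has $2\sqrt{kt} + 2t = \varepsilon k\,(2\sqrt{c} + 2c\varepsilon) \le 2\varepsilon k\,(\sqrt{c} + c/2) = \varepsilon k/2$ because $\varepsilon < 1/2$. The following small check (the names `C_UPPER` and `upper_tail_threshold_ok` are hypothetical) confirms this numerically.

```python
import math

C_UPPER = 5 / 2 - math.sqrt(6)   # the constant c from the text, about 0.0505


def upper_tail_threshold_ok(eps, k, c=C_UPPER):
    """Check 2*sqrt(k*t) + 2*t <= eps*k/2 for t = c*eps^2*k (small float tolerance)."""
    t = c * eps ** 2 * k
    lhs = 2 * math.sqrt(k * t) + 2 * t
    rhs = eps * k / 2
    return lhs <= rhs * (1 + 1e-12)
```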



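Next, we estimate the moments of the off-diagonal part $II$. We use Lemma 2.3 twice, which gives

$$\mathbb{E}_{a,\varkappa}|II|^p \le 16^p\, \mathbb{E}_{a,a',\varkappa,\varkappa'}|II'|^p, \qquad II' = \sum_{j=0}^{k-1} \sum_{i \ne i'} a_i a'_{i'} \varkappa_{j+i} \varkappa'_{j+i'} x_{j+i} x_{j+i'},$$

where $a'$ and $\varkappa'$ are independent copies of $a$ and $\varkappa$, respectively.

First, we make a substitution $v = j + i$, $v' = j + i'$ and use the Khintchine inequality with the optimal constant $C_p \le \sqrt{p}$ and the random variable $\varkappa$ to obtain

$$\mathbb{E}|II'|^p = \mathbb{E}\Bigl|\sum_{v=0}^{d-1} \sum_{v' \ne v} \varkappa_v \varkappa'_{v'} x_v x_{v'} \sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^p \le p^{p/2}\, \mathbb{E}_{a,a',\varkappa'}\Bigl(\sum_{v=0}^{d-1} x_v^2 \Bigl|\sum_{v' \ne v} \varkappa'_{v'} x_{v'} \sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^2\Bigr)^{p/2}.$$

Next, we involve Minkowski's inequality with respect to $p/2 \ge 1$ and Khintchine's inequality for the random variable $\varkappa'$:

$$\mathbb{E}_{\varkappa'}\Bigl(\sum_{v=0}^{d-1} x_v^2 \Bigl|\sum_{v' \ne v} \varkappa'_{v'} x_{v'} \sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^2\Bigr)^{p/2} \le \Bigl(\sum_{v=0}^{d-1} x_v^2 \Bigl(\mathbb{E}_{\varkappa'}\Bigl|\sum_{v' \ne v} \varkappa'_{v'} x_{v'} \sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^p\Bigr)^{2/p}\Bigr)^{p/2} \le p^{p/2} \Bigl(\sum_{v \ne v'} x_v^2 x_{v'}^2 \Bigl|\sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^2\Bigr)^{p/2}.$$

Furthermore, the Minkowski inequality for $a$ and $a'$ gives

$$\Bigl(\mathbb{E}_{a,a'}\Bigl(\sum_{v \ne v'} x_v^2 x_{v'}^2 \Bigl|\sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^2\Bigr)^{p/2}\Bigr)^{2/p} \le \sum_{v \ne v'} x_v^2 x_{v'}^2 \Bigl(\mathbb{E}_{a,a'}\Bigl|\sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^p\Bigr)^{2/p}.$$

If $a_0, \dots, a_{d-1}$ are Bernoulli variables, then Khintchine's inequality gives

$$\Bigl(\mathbb{E}_{a,a'}\Bigl|\sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^p\Bigr)^{1/p} \le \sqrt{pk},$$

as the product of two independent Bernoulli variables is again of this type.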

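For normal variables, we use first Khintchine's inequality and spherical coordinates to obtain

$$\mathbb{E}_{a,a'}\Bigl|\sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^p = \mathbb{E}_{a,a'}\Bigl|\sum_{j=0}^{k-1} a_j a'_j\Bigr|^p \le C_p^p\, \mathbb{E}_a\Bigl(\sum_{j=0}^{k-1} a_j^2\Bigr)^{p/2} = \frac{C_p^p}{(2\pi)^{k/2}} \int_{\mathbb{R}^k} e^{-\|a\|_2^2/2}\, \|a\|_2^p \, da = \frac{C_p^p A_k}{(2\pi)^{k/2}} \int_0^\infty e^{-r^2/2}\, r^{p+k-1}\, dr, \tag{2.7}$$

where

$$A_k = \frac{2\pi^{k/2}}{\Gamma(k/2)}$$

is the surface area of the unit sphere in $\mathbb{R}^k$.

We combine (2.7) with Stirling's formula and obtain

$$\Bigl(\mathbb{E}_{a,a'}\Bigl|\sum_{j=0}^{k-1} a_{v-j} a'_{v'-j}\Bigr|^p\Bigr)^{1/p} \le \sqrt{2}\, C_p \Bigl(\frac{\Gamma((k+p)/2)}{\Gamma(k/2)}\Bigr)^{1/p} \le C\sqrt{p(k+p)}.$$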
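The radial integral in (2.7) can be evaluated in closed form: $\int_0^\infty e^{-r^2/2} r^{p+k-1}\,dr = 2^{(p+k)/2-1}\,\Gamma((p+k)/2)$, so that $\mathbb{E}\|a\|_2^p = 2^{p/2}\,\Gamma((k+p)/2)/\Gamma(k/2)$ for a standard Gaussian vector $a \in \mathbb{R}^k$. The sketch below (the function name is hypothetical) evaluates the $p$-th root of this moment through `math.lgamma` and can be used to see that it is of order $\sqrt{k+p}$, which is the content of the Stirling step.

```python
import math


def gaussian_norm_moment_root(k, p):
    """(E ||a||_2^p)^(1/p) for a standard Gaussian vector a in R^k,
    computed from 2^(p/2) * Gamma((k+p)/2) / Gamma(k/2)."""
    log_moment = (0.5 * p * math.log(2.0)
                  + math.lgamma(0.5 * (k + p))
                  - math.lgamma(0.5 * k))
    return math.exp(log_moment / p)
```

For $p = 2$ the function returns $\sqrt{k}$, in agreement with $\mathbb{E}\|a\|_2^2 = k$.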
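Hence, if $a_0, \dots, a_{d-1}$ are independent Bernoulli or normally distributed variables, we may estimate

$$\bigl(\mathbb{E}_{a,a',\varkappa,\varkappa'}|II'|^p\bigr)^{1/p} \le C p \cdot \sqrt{(k+p)p} \cdot \|x\|_2^2 = C p^{3/2} \sqrt{k+p}. \tag{2.8}$$

Markov's inequality then gives, with $c = 2C$,

$$\mathbb{P}_{a,a',\varkappa,\varkappa'}\bigl(|II'| > k\varepsilon/2\bigr) = \mathbb{P}_{a,a',\varkappa,\varkappa'}\Bigl(\frac{2^p |II'|^p}{k^p \varepsilon^p} > 1\Bigr) \le \frac{2^p\, \mathbb{E}_{a,a',\varkappa,\varkappa'}|II'|^p}{k^p \varepsilon^p} \le \Bigl(\frac{c\, p^{3/2} \sqrt{k+p}}{k\varepsilon}\Bigr)^p.$$

We choose $p$ by the condition $\frac{c\, p^{3/2} \sqrt{2k}}{k\varepsilon} = e^{-1}$. We may assume $c \ge 1$, which ensures that $p \le k$ and $\frac{c\, p^{3/2} \sqrt{k+p}}{k\varepsilon} \le e^{-1}$, which leads to

$$\mathbb{P}_{a,a',\varkappa,\varkappa'}\bigl(|II'| > k\varepsilon/2\bigr) \le \exp\bigl(-c'(k\varepsilon^2)^{1/3}\bigr). \tag{2.9}$$

The two applications of Lemma 2.3 change the moments of $II$ only by the factor $4^{2p}$, so the same tail estimate holds for $II$ with a different constant $c'$.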
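To see how the exponent $(k\varepsilon^2)^{1/3}$ arises, one can solve a normalisation of the form $c\,p^{3/2}\sqrt{2k}/(k\varepsilon) = e^{-1}$ for $p$, which gives $p = (k\varepsilon^2)^{1/3}/(2c^2e^2)^{1/3}$. The sketch below does exactly this; the specific normalisation and the value of the constant `c` are assumptions of this illustration, and the function names are hypothetical.

```python
import math


def choose_p(k, eps, c=1.0):
    """Solve c * p**1.5 * sqrt(2k) / (k * eps) = 1/e for p (assumed normalisation)."""
    return (k * eps ** 2 / (2.0 * c ** 2 * math.e ** 2)) ** (1.0 / 3.0)


def markov_bound(k, eps, p, c=1.0):
    """The Markov-type bound (c * p^{3/2} * sqrt(k + p) / (k * eps))^p."""
    return (c * p ** 1.5 * math.sqrt(k + p) / (k * eps)) ** p
```

Since $p \le k$ implies $\sqrt{k+p} \le \sqrt{2k}$, the bound returned by `markov_bound` at this $p$ is at most $e^{-p}$, and $p$ is a constant multiple of $(k\varepsilon^2)^{1/3}$.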
The proof then follows by (2.1) and (2.2) combined with (2.5), (2.6) and (2.9). □

The proof of Theorem 2.1 follows from Lemma 2.4 by the union bound over all $\binom{n}{2}$ pairs of points.
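The union bound also shows where the cubic power of the logarithm comes from. Applying both tail estimates of Lemma 2.4 to each of the $\binom{n}{2}$ normalised differences gives a failure probability of at most $2\binom{n}{2}\exp(-c(k\varepsilon^2)^{1/3})$, and requiring this to be at most $1/3$ yields $k \ge \varepsilon^{-2}\bigl(\ln(6\binom{n}{2})/c\bigr)^3$. In the sketch below the constant `c` of Lemma 2.4 is not known explicitly and is treated as a hypothetical parameter; the function names are hypothetical as well.

```python
import math


def failure_bound(n, k, eps, c):
    """Union bound: two tail events for each of the n(n-1)/2 pairs of points."""
    pairs = n * (n - 1) // 2
    return 2 * pairs * math.exp(-c * (k * eps ** 2) ** (1.0 / 3.0))


def sufficient_dimension(n, eps, c, failure=1.0 / 3.0):
    """Smallest integer k (n >= 2) for which failure_bound(n, k, eps, c) <= failure."""
    pairs = n * (n - 1) // 2
    needed = math.log(2 * pairs / failure) / c   # lower bound on (k * eps^2)^(1/3)
    return math.ceil(needed ** 3 / eps ** 2)
```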

Remark 2.5. (i) We note that (2.8) follows directly from well-known estimates of moments of Gaussian chaos, cf. [8, 12]. We preferred to give a simple and direct proof.

(ii) Let us also mention that Lemma 2.4 fails if the multiplication with $D_\varkappa$ is omitted. Namely, let $k \le d$ be natural numbers, let $a_0, \dots, a_{d-1}$ be independent normal variables and let $x = \frac{1}{\sqrt{d}}(1, \dots, 1)$. If $f(x) = M_{a,k}x$, then

$$\|f(x)\|_2^2 = k \Bigl(\frac{1}{\sqrt{d}} \sum_{j=0}^{d-1} a_j\Bigr)^2.$$

Due to the 2-stability of the normal distribution, the variable

$$b := \frac{1}{\sqrt{d}} \sum_{j=0}^{d-1} a_j$$

is again normally distributed, i.e. $b \sim N(0, 1)$. Hence

$$\mathbb{P}_a\bigl(\|f(x)\|_2^2 \ge (1+\varepsilon)k\bigr) = \mathbb{P}_b\bigl(b^2 \ge 1+\varepsilon\bigr)$$

depends neither on $k$ nor on $d$ and Lemma 2.4 cannot hold.
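The counterexample of part (ii) can be reproduced directly: for the flat unit vector every row of $M_{a,k}$ produces the same inner product $b = d^{-1/2}\sum_j a_j$, so $\|M_{a,k}x\|_2^2 = k b^2$ regardless of how large $k$ is. A minimal sketch, assuming the rows of $M_{a,k}$ are cyclic right shifts of $a$ and using hypothetical function names:

```python
import math


def partial_circulant_apply(a, k, y):
    """Return M_{a,k} y, where entry (j, m) of M_{a,k} is a[(m - j) mod d]."""
    d = len(a)
    return [sum(a[(m - j) % d] * y[m] for m in range(d)) for j in range(k)]


def squared_norm_without_signs(a, k):
    """||M_{a,k} x||_2^2 for x = d^{-1/2} (1, ..., 1), with no sign flips D_kappa."""
    d = len(a)
    x = [1.0 / math.sqrt(d)] * d
    return sum(v * v for v in partial_circulant_apply(a, k, x))
```

Increasing $k$ only rescales the single random quantity $b^2$, so no concentration around $k$ can occur.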

(iii) The statement of Theorem 2.1 holds also for matrices with Toeplitz structure. The proof is literally the same, only notational changes are necessary.